This is to check if BMI is a confouding factor between disease and gene expressions.

The counts tables were generated previously using HTseq.sh shell script and merge.command on Hera. It removes the outlier data point from the analysis, which was patient 29. This script requires a transcript counts table. There is also a sample mapping file called patient_sample_mapping.csv which links the diagnosis the the samples. This file was most recently processed on Tue Sep 9 15:03:58 2014.

This step gets the protein coding genes only.

Full Analysis

plot of chunk deseq-analysis adjusting for age

## [1] 35
##  [1] "Intercept"                   "BMI1.0.25."                 
##  [3] "BMI1.25.30."                 "BMI1.30.50."                
##  [5] "groupControl"                "groupAcromegaly"            
##  [7] "BMI1.0.25..groupControl"     "BMI1.25.30..groupControl"   
##  [9] "BMI1.30.50..groupControl"    "BMI1.0.25..groupAcromegaly" 
## [11] "BMI1.25.30..groupAcromegaly" "BMI1.30.50..groupAcromegaly"
## [1] "acro.disease.normal <- results(acromegaly.cds.bmi, contrast=list(\"BMI.30.50..groupAcromegaly\", \"BMI.0.25..groupAcromegaly\"))\nsum(acro.disease.normal$padj<.05, na.rm=TRUE)\nacro.disease.overweight <- results(acromegaly.cds.bmi, contrast=list(\"BMI.25.30..groupAcromegaly\", \"BMI.25.30..groupControl\"))\nacro.disease.obese <- results(acromegaly.cds.bmi, contrast=list(\"BMI.30.50..groupAcromegaly\", \"BMI.30.50..groupControl\"))\n#acro.bmi.combined <- merge(acromegaly.results.bmi, acro.bmi.contrast, by.x=\"row.names\", by.y=\"row.names\")\nacro.bmi.combined <- cbind(acro.disease.normal, acro.disease.overweight)\ncolnames(acro.bmi.combined)[2:7]<- gsub(\"x\",\"AcrovsControl\", colnames(acro.bmi.combined)[2:7])\ncolnames(acro.bmi.combined)[8:13]<-gsub(\"y\",\"Obese_vs_Overweight\", colnames(acro.bmi.combined[8:13]))\nacro.bmi.combined <- acro.bmi.combined[order(acro.bmi.combined$padj.bmi, na.last=TRUE),]\n#acromegaly.results.bmi <- acromegaly.results.bmi[order(acromegaly.results.bmi$padj, na.last=TRUE),]\nwrite.csv(acro.bmi.combined, \"../data/processed/acromegaly_results_GRCh37.74-adjusting_bmi.csv\")"

plot of chunk deseq-analysis adjusting for age

## [1] 177
## [1] "Intercept"                   "BMI.cat_Obese_vs_Not_obese" 
## [3] "group_Cushing.s_vs_Control"  "BMI.catObese.groupCushing.s"

plot of chunk deseq-analysis adjusting for age

## [1] 1138
## [1] 6

plot of chunk graphs

## pdf 
##   2
## pdf 
##   2

plot of chunk graphs

## Importance of components:
##                          PC1    PC2    PC3     PC4     PC5      PC6
## Standard deviation     3.948 0.4963 0.3532 0.19841 1.7e-15 1.23e-15
## Proportion of Variance 0.974 0.0154 0.0078 0.00246 0.0e+00 0.00e+00
## Cumulative Proportion  0.974 0.9897 0.9975 1.00000 1.0e+00 1.00e+00
##                             PC7      PC8      PC9     PC10     PC11
## Standard deviation     1.09e-15 3.44e-16 1.16e-16 9.35e-17 8.56e-17
## Proportion of Variance 0.00e+00 0.00e+00 0.00e+00 0.00e+00 0.00e+00
## Cumulative Proportion  1.00e+00 1.00e+00 1.00e+00 1.00e+00 1.00e+00
##                            PC12     PC13     PC14     PC15     PC16
## Standard deviation     6.89e-17 5.98e-17 5.79e-17 5.11e-17 4.64e-17
## Proportion of Variance 0.00e+00 0.00e+00 0.00e+00 0.00e+00 0.00e+00
## Cumulative Proportion  1.00e+00 1.00e+00 1.00e+00 1.00e+00 1.00e+00

plot of chunk pca

## pdf 
##   2

plot of chunk pca

## pdf 
##   2

plot of chunk pca

## pdf 
##   2

Clustering

plot of chunk kmeans

## pdf 
##   2
## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing

plot of chunk Cushing effect on obese and not obese patients

## [1] 104
## [1] "Intercept"      "groupControl"   "groupCushing.s"

plot of chunk Cushing effect on obese and not obese patients

## estimating size factors
## estimating dispersions
## gene-wise dispersion estimates
## mean-dispersion relationship
## final dispersion estimates
## fitting model and testing

plot of chunk Cushing effect on obese and not obese patients

## [1] 795
## [1] "Intercept"      "groupControl"   "groupCushing.s"

plot of chunk Cushing effect on obese and not obese patients

Annotation

This step annotates the data tables with the official gene symbols.

Session Information

For the R session, the package versions were:

sessionInfo()
## R version 3.1.0 (2014-04-10)
## Platform: x86_64-apple-darwin10.8.0 (64-bit)
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] parallel  stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] RColorBrewer_1.0-5      scatterplot3d_0.3-35   
##  [3] reshape2_1.4            ggplot2_1.0.0          
##  [5] biomaRt_2.20.0          DESeq2_1.4.5           
##  [7] RcppArmadillo_0.4.320.0 Rcpp_0.11.2            
##  [9] GenomicRanges_1.16.3    GenomeInfoDb_1.0.2     
## [11] IRanges_1.22.9          BiocGenerics_0.10.0    
## 
## loaded via a namespace (and not attached):
##  [1] annotate_1.42.1      AnnotationDbi_1.26.0 Biobase_2.24.0      
##  [4] colorspace_1.2-4     DBI_0.2-7            digest_0.6.4        
##  [7] evaluate_0.5.5       formatR_0.10         genefilter_1.46.1   
## [10] geneplotter_1.42.0   grid_3.1.0           gtable_0.1.2        
## [13] htmltools_0.2.4      knitr_1.6            lattice_0.20-29     
## [16] locfit_1.5-9.1       MASS_7.3-33          munsell_0.4.2       
## [19] plyr_1.8.1           proto_0.3-10         RCurl_1.95-4.1      
## [22] rmarkdown_0.2.50     RSQLite_0.11.4       scales_0.2.4        
## [25] splines_3.1.0        stats4_3.1.0         stringr_0.6.2       
## [28] survival_2.37-7      tools_3.1.0          XML_3.98-1.1        
## [31] xtable_1.7-3         XVector_0.4.0        yaml_2.1.13